Characterising University WLANs within Eduroam Context

Author: D. Kotz
J. Kim
M. Balazinska
M. Gast
M. Papadopouli
U. Kumar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

The eduroam initiative is assuming an ever-growing relevance in providing secure, worldwide roaming access within the university WLAN context. Although several studies have focused on educational WLAN traffic characterisation, the increasing variety of devices, mobility scenarios and user applications motivates assessing the effective use of eduroam in order to sustain consistent network planning and deployment. Based on recent WLAN traffic traces collected at the University of Minho (Portugal) and University of Vigo (Spain), the present work contributes to identifying and characterising patterns of user behaviour regarding, for instance, the location and activity sector of users. The results of data analysis quantify the impact of network access location on the number of associated users, on the number and duration of sessions and on the corresponding traffic volumes. The results also illustrate to what extent users take advantage of mobility in the WLAN. Complementing the analysis on a monthly basis, a fine-grained study of WLAN traffic is provided through the identification of users' behaviour and patterns at small timescales.

CiteSeerX

Universidade do Minho: RepositoriUM

Crossref

Astronomy in the Cloud: Using MapReduce for Image Coaddition

Author: A. Connolly
B. Howe
J. Gardner
Jacob J.
K. Wiley
M. Balazinska
S. Krughoff
White T.
Y. Bu
Y. Kwon
Publication venue: 'University of Chicago Press'
Publication date: 05/10/2010

In the coming decade, astronomical surveys of the sky will generate tens of terabytes of images and detect hundreds of millions of sources every night. The study of these sources will involve computation challenges such as anomaly detection and classification, and moving object tracking. Since such studies benefit from the highest quality data, methods such as image coaddition (stacking) will be a critical preprocessing step prior to scientific investigation. With a requirement that these images be analyzed on a nightly basis to identify moving sources or transient objects, these data streams present many computational challenges. Given the quantity of data involved, the computational load of these problems can only be addressed by distributing the workload over a large number of nodes. However, the high data throughput demanded by these applications may present scalability challenges for certain storage architectures. One scalable data-processing method that has emerged in recent years is MapReduce, and in this paper we focus on its popular open-source implementation called Hadoop. In the Hadoop framework, the data is partitioned among storage attached directly to worker nodes, and the processing workload is scheduled in parallel on the nodes that contain the required input data. A further motivation for using Hadoop is that it allows us to exploit cloud computing resources, e.g., Amazon's EC2. We report on our experience implementing a scalable image-processing pipeline for the SDSS imaging database using Hadoop. This multi-terabyte imaging dataset provides a good testbed for algorithm development since its scope and structure approximate future surveys. First, we describe MapReduce and how we adapted image coaddition to the MapReduce framework. Then we describe a number of optimizations to our basic approach and report experimental results comparing their performance. Comment: 31 pages, 11 figures, 2 tables
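As a rough illustration of how image coaddition can be phrased as a map step and a reduce step, the following is a minimal in-memory sketch in plain Python. It is not the authors' Hadoop pipeline: the image layout (an integer `origin` offset on a shared pixel grid plus a `pixels` array), the `query` window, and the function names `map_image`, `reduce_pixel` and `coadd` are all assumptions made for this example. Real coaddition also involves reprojection and calibration, which are omitted here.

```python
from collections import defaultdict


def map_image(image, query):
    """Map step: emit ((x, y), value) for each pixel inside the query window.

    `image["origin"]` is a hypothetical integer offset of the image on a
    shared sky grid; real data would need a proper reprojection instead.
    """
    x0, y0 = image["origin"]
    qx0, qy0, qx1, qy1 = query  # half-open window [qx0, qx1) x [qy0, qy1)
    for dy, row in enumerate(image["pixels"]):
        for dx, value in enumerate(row):
            x, y = x0 + dx, y0 + dy
            if qx0 <= x < qx1 and qy0 <= y < qy1:
                yield (x, y), value


def reduce_pixel(values):
    """Reduce step: stack all contributions to one pixel by averaging."""
    return sum(values) / len(values)


def coadd(images, query):
    """Run map, group by pixel key (the 'shuffle'), then reduce."""
    groups = defaultdict(list)
    for image in images:
        for key, value in map_image(image, query):
            groups[key].append(value)
    return {key: reduce_pixel(vals) for key, vals in groups.items()}


# Two overlapping 2x2 toy images; the second is shifted one pixel to the right.
images = [
    {"origin": (0, 0), "pixels": [[1.0, 2.0], [3.0, 4.0]]},
    {"origin": (1, 0), "pixels": [[4.0, 6.0], [8.0, 10.0]]},
]
result = coadd(images, query=(0, 0, 3, 2))
```

In this toy run, pixels covered by both images are averaged, while pixels covered by only one image keep that image's value. In a framework such as Hadoop, the grouping performed here by the dictionary would be handled by the shuffle phase between mappers and reducers.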

arXiv.org e-Print Archive

Crossref

Automated migration of build scripts using dynamic analysis and search-based refactoring

Author: Adams B.
Antoniol G.
Balazinska M.
Bodhuin T.
Doval D.
Guttmann W.
Holzmann G.
Hunt G.
Kirsopp C.
Mitchell B. S.
Neagu A.
Neundorf A.
Vakilian M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 05/01/2014

The efficiency of a build system is an important factor for developer productivity. As a result, developer teams have been increasingly adopting new build systems that allow higher build parallelization. However, migrating the existing legacy build scripts to new build systems is a tedious and error-prone process. Unfortunately, there is insufficient support for automated migration of build scripts, making the migration more problematic. We propose the first dynamic approach for automated migration of build scripts to new build systems. Our approach works in two phases. First, from a set of execution traces, we synthesize build scripts that accurately capture the intent of the original build. The synthesized build scripts are typically long and hard to maintain. Second, we apply refactorings that raise the abstraction level of the synthesized scripts (e.g., introduce functions for similar fragments). As different refactoring sequences may lead to different build scripts, we use a search-based approach that explores various sequences to identify the best (e.g., shortest) build script. We optimize search-based refactoring with partial-order reduction to explore refactoring sequences faster. We implemented the proposed two-phase migration approach in a tool called METAMORPHOSIS that has been recently used at Microsoft.

CiteSeerX

Crossref

Spiral - Imperial College Digital Repository

SRBench: A streaming RDF/SPARQL benchmark

Author: A. Arasu
A. Bolles
A.P. Sheth
C. Bizer
D. Le-Phuoc
D.F. Barbieri
E. Bouillet
E. Valle Della
J. Pérez
J.-P. Calbimonte
K. Whitehouse
M. Balazinska
O. Corcho
Y. Guo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

We introduce SRBench, a general-purpose benchmark primarily designed for streaming RDF/SPARQL engines, completely based on real-world data sets from the Linked Open Data cloud. With the increasing problem of too much streaming data but not enough tools to gain knowledge from them, researchers have set out for solutions in which Semantic Web technologies are adapted and extended for publishing, sharing, analysing and understanding streaming data. To help researchers and users compare streaming RDF/SPARQL (strRS) engines in a standardised application scenario, we have designed SRBench, with which one can assess the abilities of a strRS engine to cope with a broad range of use cases typically encountered in real-world scenarios. The data sets used in the benchmark have been carefully chosen, such that they represent a realistic and relevant usage of streaming data. The benchmark defines a concise, yet comprehensive set of queries that cover the major aspects of strRS processing. Finally, our work is complemented with a functional evaluation on three representative strRS engines: SPARQLStream, C-SPARQL and CQELS. The presented results are meant to give a first baseline and illustrate the state of the art.

Crossref

Archivo Digital UPM